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Abstract 


The problem of time series approximation by series of finite rank is considered from the viewpoint 
of signal extraction. For signal estimation, a weighted least-squares method is applied to the 
trajectory matrix of the considered time series. Matrix weights are chosen to obtain equal or 
approximately equal weights in the equivalent problem of time-series least-squares approximation. 
Several new methods are suggested and examined together with the Cadzow’s iterative method. The 
questions of convergence, computational complexity, and accuracy are considered for the proposed 
methods. The methods are compared on numeric examples. 

1 Introduction 

Consider the problem of extracting a signal S = (si,..., sat) from an observed noisy series X = S-l-N, 
where S is governed by a linear recurrence relation (LRR) of order r: 


r 



n = r + 1,..., N] Or / 0. 


2 = 1 


Generally, series, which are governed by LRRs, may be written in a parametric form 



( 1 ) 


where Pi{n) are polynomials of n. However, a parametric regression approach for the problem does 
not lead to accurate estimation of parameters due instability of estimates. 


It is known that methods based on signal subspace estimation (subspace-based methods) work well 
[SlEniEllII]. These subspace-based methods use the following approach. Let us fix a window length 
L, 1 < L < N, set K = N — L + 1, and build the trajectory matrix for the series §: 


S 


(Si S2 ... Sk ^ 

S2 S3 ■ ■ ■ SK+l 


\SL SL+1 ■ ■ ■ SM / 


Note that S G 72, where % is the set of Hankel matrices with equal values on their anti-diagonals 
i + j = const. Let § be governed by an LRR of order r, r < mm{L, K), and be not governed by an 
LRR of smaller order. Then rank S = r and therefore S is a Hankel matrix of low-rank r. The column 
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space of S, that is, the signal subspace, provides estimates of a* and oji in 0 by the ESPRIT method 
[171 [I3] applied to S. 

Let X be the trajectory matrix of the series X. Then the problem of estimation of S and the signal 
subspace can be considered as a problem of approximation of the matrix X by a Hankel matrix of 
rank not larger than r; 

||X-Y|||^ min , (2) 

rankY<r 

Y&H 


where || • ||f is the Frobenius norm. 

Many papers are devoted to this problem, e.g., oiiniiiaE] among others, where the problem is 
called Structured Low-Rank Approximation. Numerical solutions of the problem are iterative; e.g., 
the Cadzow iterative method |3] consists of alternating projections to the sets of Hankel matrices and 
of matrices of rank not larger than r. The target function is not unimodal in such class of problems, 
and convergence to the global minimum is not guaranteed; despite this, the problem Q is considered 
to be well-researched, though it still has many open questions. 

Note that the problem Q is equivalent to the problem of weighted approximation of the series 
X = (xi,.. .,xn)- 

N 

E Wiixi — Vi )'^min , (3) 

^ ^ Y:rankY<r ^ 

Y&n 


where 



^N-i + 1 


for z = 1,..., L — 1, 
for i = L,..., iL, 
for i = iL -|- 1,..., 


(4) 


and Y is the trajectory matrix of the series Y. 

The weights 0 at both ends of the series are smaller than that in the center, i.e. the ordinary 
least-square problem Q for matrices corresponds to a weighted least-squares problem for series. 

The aim of this paper is to consider methods which solve the problem with equal weights instead 
of Wi and then to compare the constructed methods in terms of accuracy of the signal estimation. 
All described methods are iterative. If one is interested in a signal estimate, which is not necessarily 
governed by an LRR, then the first iteration can be taken as a low-cost estimate of the signal. Hence, 
the described methods are compared by accuracy of the signal estimation at the first iteration and in 
the limit. Note that Singular Spectrum Analysis (SSA) [211201EJ HU EJ US] applied to the problem of 
signal estimation can be represented as the first iteration of the Cadzow method. 

The structure of the paper is as follows. In Sectionthe problem of approximating a matrix by a 
Hankel rank-deficient matrix is considered. The common structure of iterative alternating-projection 
algorithms is described, approaches to construction of the projectors are given, the convergence theo¬ 
rem is proved. 

In Section the relation between the problems of approximation of time series and of their 
trajectory matrices is described. The relationship between weights in equivalent weighted least-squares 
problems is also given. Section contains the suggested time-series approximation algorithms. In 
Section a numeric comparison of algorithms on a typical simulated example is performed. Section 
contains an example with analysis of real-life data. 

The paper is summarized and conclusions are drawn in Section Supplementary results on SSA 
separability, which has a connection with the convergence rate, are proved in Appendix 
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2 Approximation by rank-deficient Hankel matrices 


2.1 Common scheme of iterations 

Consider the problem of projecting a point x to a set % H At in a Hilbert space X with a inner 
product (•, •), where H and At are closed under the limit operation, H is linear subspace, while At is 
closed with respect to scalar multiplication, i.e. if z G At, then az G At for any a. Note that At is 
not necessarily a linear space or a convex set. 

Thus, the problem is formulated as 

l|x — y|| —)• min over y G H n Al, (5) 

y 

where || • || is the norm corresponding to the inner product. 

To present the algorithm’s scheme for the solution of this problem, let us introduce the projectors 
to the subset A4 and subspace H with respect to the norm || • ||: H^j is the projector to A4, H-^ is the 
projector to %. Note that if the projection to A1 is not uniquely defined, then we suppose that in the 
case of ambiguity any closest point is chosen. The projector to % is evidently orthogonal, while 
is orthogonal due to the following proposition. 

Proposition 1. Let X be a Hilbert space, JH cX be a subset closed with respect to scalar multiplica¬ 
tion, n_A 4 he the projection operator to M.. Then for any x gX the following equation (“Pythagorean 
equality”) is true: ||a;|p = ||a: — n_A 4 a;|p + ||n_A 4 a;|p. 

Proof. Define y = n_A 4 X. Since 


X 


= ||x-y||2 + ||y||2 + 2(x-y,y), 


we should prove that (x — y, y) = 0. Assume the opposite: (x — y, y) / 0. Then for 


7 = 


(x,y) 

(y>y) 


(x — 7 y, 7 y) = 0 and therefore ||x — y|p > ||x — 7 y|p: 


l|x-yf - ||x- 7 yf = 

(y,y)-2(x.y> + hA! = hrf>0. 


(y,y) 


(y>y) 


Since 7 y lies in M according to the property of A4, the contradiction with the fact that y = Hy/jx is 
the closest point to x is acquired. □ 

Remark 1. The proof of Proposition^ yields that for any y G M one can perform an adjustment 
Ay) = £ A4 such that A{y) is not further from x than the original y. Moreover, A{y) is 

orthogonal to x — A{y). 

Let us consider the iterative method of alternating projections for the problem ([^, which is given 
by the following iteration step: 


yfc+i = n^ny^y^,, where yg = x. (6) 

In the following theorem, we investigate convergence of the sequence Q. 

Theorem 1. Let the conditions of Proposition^^ he fulfilled and also the set A4 and the space TL he 
closed under the limit operation. Then 
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1. WVk — n^y^ll —)• 0 OS A: —>■ +oo, WliMt/k ~ l/fc+ill 0 as k ^ + 00 . 

2. Let Ai (1 Bi be a compact set, where Bi = {z : \\z\\ < 1} is the closed unit ball. Then there 

exists a convergent subsequence of points ... such that its limit y* belongs to Ad Cl TL. 

Proof. Let us use the following inequalities: 

Ilyfc - ^MYkW > W^MYk - yfc+ill > 

Wyk+i-^Myk+iW- (7) 

Indeed, since the projection n_yv(Z is not further from z than any other point from Ai and the similar 
statement is valid for we have Hn^A^y^ — z|| > ||z — n_A 4 z||, where z = y^+i, and Hy^, — z|| > 
||z - n^z||, where z = n^y^,. 

1. According to inequalities Q, the sequences ||y^, — n_A 4 yfc||, k = 1,2,..., and ||n_A 4 yfc — yfc+i||, 
k = 1,2,..are non-increasing. It is obvious that they are limited below by zero. Therefore, 
they have the same limit c due to Q. 

Let us prove that c = 0 assuming the opposite c > 0. Then there exists d > 0 such that 
||y^ — n_A 4 yfc|| > d and UlI^Ky;, — y^+iH > d for any k = 1,2,.... In accordance to Proposition]^ 
the following equality is valid: Ily^lP = Hy^ — + ||n_A 4 yfc|p. Since the space TL is 

linear, the following equality is valid too: llll^y^p = Hn^y;;, - Il-ullMyk\? + \\^-H^Myk\? = 
W^MYk - yfc+if + llyfc+ilP- Therefore, 

Ilyfcf = W'^MYkf + Ilyfc - ^MYkf = 

Ilyfc - ^MYkf + linA^yfc - Yk+if + llyfc+if • 

Thus, ||yfc_|_i|p < IlyfclP — 2d^. Expanding this inequality by the same way, we obtain that 
hk+jf < lly/cf - 2jd2 for any j = 1,2,.... Choose A; = 1, and j = \\\y{2d‘^)'] + 1. Then 
||yfc+j|P < 0, which is impossible. Thus, c = 0. 

2. Consider the sequence Tlj^y^, k = 1,2,..., which is bounded, since ||n_A 4 z|| < ||z|| (by Propo¬ 
sition]^ and lln-^zll < ||z|| for any z G X. The sequence belongs to a compact set, since Ai 
is closed with respect to scalar multiplication, and we can resize the unit ball to cover the se¬ 
quence. Then a convergent subsequence (n_^yj^) can be chosen; denote by y* G Ad its limit 
and notice that ||n_.v(yij, — y^+ill = WLlMYit, ~ ^'uBMYi^W —)• 0 as A; —>■ -|-oo. Since TL is closed, 
and X is a Banach space, the projector is a continuous mapping. Taking into consideration 
that ||z — n-^zll is a composition of continuous mappings, we obtain that ||y* — n-^y*!! = 0, 
y* G Ad n LA. Finally, II-^ is a continuous mapping and therefore the sequence {IlyYly 4 y^^) 
converges to y*. Thus, y^^+i is the required subsequence. 

□ 

Actually, Proposition ]^ was in fact proved in [7] for a particular case, while inequalities Q are 
extensions of [H inequalities (4.1)]. 

Let us apply Theorem]^ to the case of matrix approximation by rank-deficient Hankel matrices. 
Let X = i.e. X be the space of matrices of size L x K equipped with some inner product, 

LA C be the space of Hankel matrices. Ad = Ad^ C be the set of matrices of rank not 

larger than r. Then the iterative step ]^ for method of alternating projections has the following form: 

Yfc+i = UnBMYk, where Yq = X G R^^^. 

It is well known that the set Ad^ is closed with respect to the conventional Frobenius norm and 
therefore is closed to any norm, since in the matrix space all the norm are equivalent. The closed 
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unit ball is obviously a compact set in finite-dimensional Euclidean space. Therefore the conclusion 
of Theorem holds. Note that the existence of a convergent subsequence can be deduced from [3]. 
However, our proof of this fact is based on different assumptions; in particular, we stress on the 
Pythagorean equality for projections to sets which are closed with respect to multiplication. 

In this paper, we consider norms (semi-norms) in X generated by weighted Frobenius inner products 
in the form, which is parameterized by a matrix M with positive (non-negative) entries rriij: 

L K 

(Y, Z)m = EE mi,kyi,kZi,k- (8) 

1=1 k=l 

Therefore, the conclusion of Theorem holds if the weights rriij are positive. 

2.2 Evaluation of projections 

Let us consider the weighted norm II'llM generated by Q, that is, ||X|p = ||X||^ = Ylk=i''^i,kxf j.- 

2.2.1 Projector H^. 

It is easy to show that H-^ can be evaluated explicitly using the following proposition. 

Proposition 2. For Y = UyY we have 

Y,l,k:l+k=i+j''^l,kyi,k 

Vij = ^ 

2—il,k:l+k=i+j ^l,k 

It is impossible to derive an explicit form of in the case of arbitrary weights. Consider one 
specific case and suggest an iterative approach to the general case. 


2.2.2 Case of the explicit form of the projector H^^. 


For equal weights rriij = 1, denote H^ = It is well-known that the projector H^Y can be 

evaluated as the sum of r leading components of the singular value decomposition (SVD) of the 
matrix Y. More precisely, let L < iL for simplicity and Y = USV"*" be the SVD, where U is an 
orthogonal matrix of size L x L, S is a quasi-diagonal matrix of size Lx K with non-negative diagonal 
elements (cxi,... ,cfl) in non-increasing order, and V is an orthogonal matrix of size K x K. Denote 
by Xr = (o'lf,) the following matrix: 



if i = j,i < r, 
otherwise. 


Then the projection can be evaluated as Hj-Y = US^V"'". The next proposition describes the case 
when evaluation of a projector Hmt is reduced to application of the projector H^. 

Proposition 3. Let there exist a symmetric positive semidefinite matrix C of size K x K sueh that 
for a given M the equality ||Z||^ = tr(ZCZ'^) holds for any matrix Z G Suppose that the 

column space of a matrix Y lies in the column space of the matrix C. Then 

n^,Y = (n,B)(oT)t, (9) 

where Oc is a matrix such that C = O^Oc, B = YO^, (Oq)^ denotes Moore-Penrose pseudoinverse 
to the matrix Oj. 

Proof. The proof is a direct consequence of the fact that the considered norm is generated by an 
oblique inner product in the row space of Y, see details in [121H]- El 

Remark 2. In fact, the condition ||Z||^ = tr(ZCZT) of Proposition can he fulfilled only if C is 

diagonal and M has a specific form, see Proposition 
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2.2.3 The projector in the general case. 

Since the projector can not be found explicitly for arbitrary weights rriij, iterative algorithms are used 
in the general case. One of these algorithms is described in [18]. Denote by © the element-wise matrix 
product. 

Algorithm 1. Input: initial matrix Y, rank r, weight matrix M, stop criterion STOP. 

Result: Matrix Y as an estimate o/n_v(^Y. 


1. Yo = Y, fc = 0. 

2. Y/^_|_]^ — n^(Y O IVI Y/^ O (Q ]VI))j where Q G is the matrix of all ones; fe •(— + 1. 

3. If STOP, then Y = Y^.; else go to 2. 

Note that in the case, when mij are equal to either 0 or 1, Algorithm is an EM-algorithm |18j : 
hence, properties of EM-algorithms are carried out and the sequence Y^ converges to a local minimum. 
Eormally, it does not matter what values are in Y at positions of zero weights. However, these values 
can influence the algorithm’s convergence rate and the limiting values. 


3 Time series and problem of matrix approximation 

3.1 Problem statement for time series 

Consider a time series X = (xi,... ,xn) of length N > 3. Let us hx a window length L, 1 < L < A, 
denote K = N — L + 1. Also consider a sequence of L-lagged veetors: 

Xi = {xi,... ,Xi+L-i)^, i = l,...,K. (10) 

Dehne an L-trajectory matrix of the series X as X = [Xi Xk]- 

Suppose that 0 < r < L. We say that the series X has L-rank r if its L-trajectory matrix X has 
rank r. 

Note that the series X can have L-rank r only when 

r < min(L, K). (11) 

Further we suppose that L is not larger than K, since the problems of approximation of X and 
X"*" coincide. 

Let Xtv be the set of time series of length N, X)y be the set of time series of length N which has 
L-rank not larger than r. For a given time series X G Xjv, a window length L, 1 < L < A, and a rank 
r satisfying condition ([n]), consider the problem: 

N 

/gW ^ min , fq{Y) = ^ qi{xi - yif, (12) 

^ ^ i=i 

where Y = (yi,... and qi,...,qN are some non-negative weights, qi > 0, i = 1,...,A. The 
squared Euclidean distance to X in is one of reasonable target functions. It coincides with /q(Y) 
when qi = 1, i = 1,... ,N. 


Adjustment. Let an estimate Y G Xjy of the solution of the problem (12) for approximation of 
X G Xat be obtained. Then, according to Remark [T| the estimate can be adjusted to obtain a better 
estimate Y* = A(Y), which is called an adjustment ofY. 
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3.2 Equivalent target functions 

Let X = (xi,...,X 7 v) G Xtv be a time series of length N, X. = (xi^k) £ Then there exists a 
one-to-one mapping T between Xjv and Ti, which can be written as 


T(X) = X, where xi^k = xi+k-i- 


Due to this one-to-one mapping, the problem (12) of time series approximation can be expressed 
in terms of matrices. 

In the space Xtv of time series, the target function (12) can be given explicitly /g(Y) = ||Y — X||g 
using a (semi) inner product 

N 

{Y,Z)q = ^qiyiZi, (13) 

i=l 

where qt are positive (non-negative) weights. 

Consider two (semi) inner products in the space of matrices which are extensions of the 

conventional Frobenius inner product. 

Denote, as before. 


L K 

(Y, Z)i,M = (Y, Z)m = ^ ^ mi^kyi,kZi,k- 

1=1 k=l 

for a matrix M G with positive (non-negative) elements and also 

(Y,Z)2,c =tr(YCZT) 


(14) 


(15) 


for a positive (semi)definite symmetric matrix C G 

Note that if the matrix M consists of all ones, i.e. rriij = 1, and if C is the identity matrix, then 
both inner products coincide with the standard Frobenius inner product. 

Proposition 4. 1. LetY = T(Y), Z = T(Z). Then (Y, Z)g = (Y,Z)i^m if and only if 

qi= ^ m,k- (16) 


1<1<L 

l<k<K 


2. The equality (Y, Z)i^]vi = (Y, Z) 2 ,c is valid if and only if the matrix C = diag(ci,..., ck) and 

mi,k = Ck- (17) 


Proof. To prove the first statement, note that 

L K 

(Y, Z)i^M = EE 

i=l j=l 

The proof of the second statement is a consequence of the fact that only for a diagonal matrix C the 
corresponding inner product has a form appropriate to ( |14| ) (see also Remark]^: 

L K 

(V,Z).c = EE Ckyi,kZl,k- 

1=1 k=l 


□ 


Corollary 1. If mij = I, i = I,..., L, j = I,..., K, 
I,... ,N, given by (16) are equal to Wi introduced in 


then the equivalent series weights qi, i = 
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Note that the matrix norm || • ||2,c with a diagonal matrix C is a particular case of the norm 
II • ||i,M- However, this particular case is of special interest, since the corresponding approximation 
problem can be solved by means of the ordinary SVD, see Proposition!^ 


Remark 3. If the condition (16) is carried out and all weights qi and mij are positive, then the 
problem (12) is equivalent to the problem 


/m(Y) ^ 


min , 
YeMrnn 


/m(Y) 


L K 

X - Y||?_m = '^'^- yi,kf- 

l=l k=l 


(18) 


4 Algorithms 


In this section we suggest a range of algorithms for 
X = S + N, where S is a time series of finite rank r 
serve as estimates of the signal S. 


solving the problem (12). In the model of series 
and N is a noise series, results of the algorithms 


4.1 Cadzow iterations 


The aim of the Cadzow algorithm [3] is the least-squares approximation (18) of the trajectory matrix 
of a series with respect to the norm || • ||i,m with the weights mij = 1 (i.e. the algorithm solves the 
problem ([^, which, by Corollary [T corresponds to the problem (|^ (or, the same, to the problem (12) 
with the weights qi = Wi given in (^). The drawback of this algorithm consists in the unequal series 
weights wp. they are larger in the center than at both ends of the time series. Note that smaller 
window lengths leads to more uniform weights. 

Note that in the case of unit weights m^j = 1, the projections H-^ and Hmt — can be easily 
calculated, see Sections [mi and [2X^ 


Algorithm 2 (Cadzow iterations). Input: Time series X, window length L, rank r, stop rule STOPl 
(e.g., given by quantity of iterations). 

Result: Approximation § of time series X by finite-rank series of rank r. 


1. Yq = TX, k = D. 

2. Yfc+i = n-^nrYfc, k^k + l. 

3. If STOPl, then § = else go to 2. 


4.2 Weighted Cadzow iterations 

Let qi = 1, i = 1,..., N,he chosen in (12). According to Proposition the problem (12) is equivalent 
to the problem (18) with weights 

mi,k = -, (19) 

Wl+k-l 


where Wi are introduced in (Q. 


Algorithm 3 (Weighted Cadzow iterations). Input: Time series X, window length L, rank r, stop 
rules STOPl for outer iterations and STOP2 for inner iterations. 

Result: Approximation S of time series X by finite-rank series of rank r. 


1. Yo = TX, k = 0. 

















2. Obtain Z using Algorithm ^applied to for estimation ofUj^^Y^ with stop criterion ST0P2. 

3- Yfc_|_i = k ^ k + 1. 

4- If STOPl, then § = T'^Y^; else go to 2. 


4.3 Extended Cadzow iterations 


Let us introduce the Extended Cadzow algorithm, which presents a different approach to the problem 
(12) with equal weights than the Weighted Cadzow algorithm does. Formally, let the series X be 
extended to both sides on L — 1 measurements with some values having zero weights, i.e., the added 
measurements are considered as gaps. Thus, the length of the extended series X is Y + 2L — 2, and 
the size of its trajectory matrix XisLbyY + L — 1 (instead of Y — L + 1 for the non-extended 
trajectory matrix). 

For the extended series. Algorithm with weights mjj = 7T1 is applied to X, where the series I 
has ones in the place of the series X and zeroes in positions of gaps, i.e. 


1 I <i+ j - L < N, 

TTliJ = < 

I 0 otherwise. 

Algorithm 4 (Extended Cadzow iterations). Input: Time series X, window length L, rank r, stop 
criteria STOPl for outer iterations and STOP2 for inner iterations, left and right extension values 
h]^_i and Mi_i. 

Result: Approximation S of time series X by finite-rank series of rank r. 


1. Yq = 'VK, where X = (L^^-i,X,Mi_i), k = 0. 

2. Obtain Z using Algorithm ^applied to Y^ for estimation ofUMr^k with stop criterion STOP2. 

3. Yfc_|_i = rt-^Z, k i — /c + 1. 

4- Construct Y^ consisting of the columns of the matrix Y^, from L-th to N-th ones. If STOPl, 
then S = T^^Yfc; else go to 2. 


4.4 Oblique Cadzow iterations 

Algorithms considered in this section generalize the conventional Cadzow algorithm based on the 
Euclidean inner product to the use of an oblique inner product given by a matrix C. These algorithms 
can be applied if the conditions of Proposition hold. 

Algorithm 5 (Oblique Cadzow iterations). Input: Time series X, window length L, rank r, matrix 
C = diag(ci,..., ck), where K = N — L -\-1, stop criteria STOPl. 

Result: Approximation S of time series X by finite-rank series of rank r. 


1. Yo = TH, k = 0. 

2. k k 1, where is given by (§. 

3. If STOPl, then § = T^^Y^; else go to 2. 


To solve the problem (12) of approximation of time series with equal weights qi, a proper matrix C 
should be chosen. It is found that there is no such full-rank matrix; therefore, a few variants providing 
approximately equal weights are considered below. 
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4.4.1 Cadzow(a) iterations 

The following lemma describes a case, when the conditions of Proposition are fulfilled and therefore 
the problem ( [l2) ) with equal weights qi is equivalent to the problem 

Lemma 1 ([H]). Let X G Xj\f, X = T(X) G If h = N/L is integer, then for qi = 1 we have 

||X||g = IIXII 2 c, where C = diag(ci,... ,ck) with diagonal elements 


Ck 


1, ifk = jL + 1 for some j = 0,... ,h — 1, 
0, otherwise. 


This approach has an essential drawback. Since zeroes are placed at the diagonal of the diagonal 
matrix C, C has rank h, which is considerably smaller than K. The change of the diagonal zeroes to 
some small a. is suggested in [8] to improve rank-dehciency. 

Let 


Cfc Cfc(Q;) 


1, if /c = fL + 1 for some j = 0,..., h — 1, 
a, otherwise. 


( 20 ) 


Then the matrix C{a) = diag(ci(a),..., ck{o()) with the diagonal given in (20) is of full rank. How¬ 
ever, the corresponding series weights are not equal. 

Let Cadzow(a) denote the iterations performed by Algorithm with the diagonal matrix C = 
C{a). Note that for a = 1 the matrix C(a) is the identity matrix and the Cadzow(a) iterations 
coincide with the conventional Cadzow iterations. 


Degenerate case a = 0. Equality ( |17| ) provides the form of a matrix M to obtain || • ||i,m = || • Ib.c 
in the case a = 0: 


M = 


/I 

0 

0 •• 

• 0 

1 

0 ••• 

... 1\ 

1 

0 

0 •• 

• 0 

1 

0 ••• 

... 1 







... 1 

Vi 

0 

0 •• 

• 0 

1 

0 ••• 

... ij 


( 21 ) 


Remark 4. The optimization problem (18) with the matrix M given in (21) eorresponds to the search 


of an arbitrary (not necessary Hankel) matrix of rank not larger than r, which is closest in the Frobenius 
norm to the matrix 


( xi 


XL+l 


\XL X2L 


XK 


XNj 


( 22 ) 


This problem is quite different from the problem (12) of approximation by finite-rank series. Therefore, 


the Cadzow(0) algorithm does not solve the problem (12). 


4.4.2 Cadzow-C iterations 

Let us correct the rank-deficiency of C(0) by another way. 


To obtain equal series weights qi = 1 in (12), we should choose the weight matrix M in (18) 
with weights defined in (19). Generally, there is no a matrix C providing the equivalent norm 
" 'b.c = II • ||i,M) since the matrix C should be diagonal and therefore the matrix M should have 


columns consisting of equal elements (see Proposition Q . 
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To obtain approximately equal weights, the following approach is suggested. Consider the set 


Z c R 


LxK 


of matrices with columns consisting of equal elements and find M such that 


|M — M|| —)• min, 
Mez 


where || • || is the Frobenius norm. 

The solution M is evidently constructed as the averaging of the matrix M by columns. As a result, 
the resultant matrix C such that II • IL ?. = II • Ik has the form C = diag(ci,..., Ck), where 


12,C 


ll,M 


Ck- L 


1 ^ 

eE 


m,k- 


(23) 


1=1 


We call Algorithm with the matrix C = C Cadzow-C iterations. 


4.4.3 Weights qi in (12) produced by the algorithms 

Since the norm || • || 2 ,c with C(a) or C in place of C does not correspond to equal series weights, 
let us find qi{a) and qi from the equalities ||Y|| 2 g = ||T||q and ||Y|| 2 ^c(a) = ll^ll(}(o)- Formulas for 
calculation are provided in Proposition 
The following statements are valid. 

Proposition 5. Let h = N/L be integer, C(a) = diag(ci(Q:),..., Ci<'(a)), where Ci(a) are given in 
(20), 0 < a < 1. Then the weights qi{a) have the form 

1 + (z — l)a i = 1,..., L — 1, 

qi{a) = < 1 + {L-I)a i = L,... ,K - 1, 

_ 1 + {N — i)a i = K,..., N. 

Proof. The proof is a straightforward consequence of Proposition □ 

To illustrate the form of the weights qi, let us formulate propositions with simplifying conditions. 


Proposition 6. Let N > 3(L — 1). Then the diagonal matrix weights Ck defined in (23) are equal to 


Ck= < 


A J _l_ i 

L yL ' 2-^j=k j 

l/L, 

CK-k+l, 


1 < fc < L - 1, 
L<k<K — L + 1, 
K - L + 2 <k < K. 


Proof. To prove the proposition, it is sufficient to substitute mi^k defined in (19) to (23). 
Proposition 7. Let N > 4(L — 1). Define 


□ 


. ^ ) ~2l? 
1 + 


Ui = 




2iL—i—i^ I L—i 


1 < i < L - 1, 

+ ^{HL-i-Hi_L), L<i<2L-l, 


^ L 

where Hq = 0, and Hi = Yl]=i harmonic number. Then the weights qi have the form: 


qi= { 


1 < i < 2L - 1, 

2L <i < N -2L + 1, 
i+i) Y — 2L + 2 < z < N. 


Ui, 

h 
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Proof. For l<z<L — l,we have 


^ ( j 1 ^ i{i + 1) 

« = = I + Es:l = 

i=i i=i \ k=j 


2L2 


1 1 1 min(/c, i) 

T 2^ h ~ 9r2 ^ T 2^ 


^ k 

k=l j=l 


k=l 


k 


*(* + 1) , ^ n , LT 

-^^ + -{1 + Hl-i-H, 


For L < i < 2L — I, changing the order of summation, we obtain 
L L-l 


Qi — Ci-L+j — Cj + 

j=l j=i-L+l 


i — L + 1 
L 


L-l 


^ L-l L-l ^ 


i — F “t" 1 1 ^ 1 > 

+ 75 E J + 7 E E 


L 


j=i-L+l 


L ^ ^ k 

j=i-L+l k=j 
L-l k 


L 


2F2 


1 

k 


i — F “t" 1 2iL — i — 1 v—> v—^ 

+ +i E E 

k=i—L+l j=i—L-\-l 

= 1 7 


2F2 


F 


The weights qi for A^ — 2F + 2<i<Ai are calculated by symmetry. The center series weights are 
evidently equal to 1. □ 



Index, L = 8 


Figure 1: Normalized series weights qi corresponding to C{a) and C. 
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Let us normalize series weights so that their sums equal 1. The normalized weights qi{a), for a = \ 
(the conventional Cadzow iterations), a = 0 (equal qi), a = 0.1, and qi for N = 40, L = 8, are shown 
in Figure 


4.5 Comments to algorithms. Comparison 

Let us comment and compare the following methods: the Weighted Cadzow iterations (Algorithm]^, 
the Extended Cadzow iterations (Algorithm]^, the Cadzow(a) iterations, 0 < a < 1, coinciding with 
the conventional Cadzow iterations if a = 1, and finally the Cadzow-C iterations (Algorithm]^. Note 
that the window length L is a parameter for each of the considered methods. 


• Theoretical convergence. Theorem [T] provides conditions for the existence of a subsequence, 
which converges to a matrix from Air^TL. This theorem is applicable directly to Algorithm 
if all weights are positive and to Algorithm if to suppose that the weighted projection to Air 
can be calculated with no error. It is easy to extend Theorem to be applicable to Algorithm 
1^ where the weights for added values are zero, if to consider the sequence instead of Y^.. 


• Convergence in practice. Although the theory says about the existence of converging subse¬ 
quences, the convergence of the constructed sequences took place in all the training examples. 

• Comparison by accuracy. The methods are iterative, and convergence to the global minimum 
in the corresponding least-squares problem does not necessarily take place. Therefore, different 
algorithms corresponding to the same weights can yield different approximations. Hence, the 
comparison of the algorithms by the approximation accuracy makes sense. 


• Signal estimation and series approximation. The proposed methods can be considered as both 
approximation methods of the original series by finite-rank series and weighted least-squares 
methods for signal estimation. Note that generally the approximation quality can contradict to 
the estimation accuracy due to possible over-fitting. 


Algorithms and series weights. The Weighted Cadzow and Extended Cadzow methods try to 
solve the problem (12) with equal weights qi. The other methods work with weights with different 
levels of non-uniformity. 


• Algorithms and computational costs. All suggested algorithms are iterative. However, each 
outer iteration in the Weighted Cadzow and Extended Cadzow algorithms has a step with inner 
iterations. Therefore, these algorithms are very time-consuming. The other algorithms do not 
contain inner iterations; moreover, they have similar computational costs of one iteration and 
can be compared by the number of iterations. Computational complexity is described by both 
complexity of one iteration and the number of iterations. Evidently, the necessary number of 
iterations is determined by the convergence rate. 


• Fast implementation. There is a very fast implementation of iterations of the Cadzow algorithm 
suggested in [15] and extended in m- However, it can be shown that the same implemen¬ 
tation approach can be applied to the Cadzow(a) and Cadzow-C algorithms. Therefore, fast 
implementations of these algorithms still can be compared by the number of iterations. 


• Use of the first iteration for signal estimation. One iteration of the Cadzow iterations is exactly 
the well-known Singular Spectrum Analysis (SSA) method, which can solve a significantly wider 
range of tasks than the iterative method does. By analogy, together with the limiting series, we 
are interested in the signal estimation by means of the first iteration of the considered algorithms. 
In a sense, each iterative method produces a modihcation of SSA. The first iteration is generally 
not of finite rank; however, it has low computational complexity and can provide sufficient 
accuracy. 
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• Separability, the first iteration and the convergence rate. Separability of a signal, which is an 
important concept of the SSA method, means the ability of a method to (approximately) separate 
the signal from a residual. From the viewpoint of the iterative methods, the separability quality 
is closely related to the accuracy of the first iteration of the method. On the other hand, we 
can expect that the accuracy of the first iteration is connected with the method’s convergence 
rate. Therefore, the separability accuracy is connected with the convergence rate of iterative 
methods. 


• Separability and choice of parameters. The connection between the separability and the window 
length L is well studied for the SSA method, see [9]. In particular, optimal window lengths are 
close to half of the series length. A small window length L provides poor separability. We can 
expect that this is valid for the other algorithms too. The Cadzow(a) method has an additional 
parameter a. Influence of the parameter a on separability in the class of Cadzow(a) iterations 
is investigated in Appendix]^ The studied example of separability of a sine-wave signal from a 
constant residual shows that small values of a provide poor separability. 

• Equal series weights and choice of parameters. Let us consider the dependence of series weights 
produced by the Cadzow(a) algorithm on the window length L or a. Proposition shows that 
more uniform weights are achieved for small L and small a. This is exactly the case corresponding 
to poor separability. 

• Equal series weights and accuracy of signal estimation. Thus, the weights, which are close to 
equal ones, correspond to algorithms, which either have a time-consuming iteration step with 
inner iterations or are slowly convergent; therefore such algorithms have high computational 
complexity. There are no theoretical results about the behavior of the estimation accuracy in 
dependence on algorithms and their parameters. However, the numerical study shows that the 
best accuracy is achieved in the algorithms corresponding to the weights, which are equal or 
almost equal. 


Remark 5. The adjustment A, which is suggested in Section 3.1 for improvement of estimates, can 
be applied to the resultant signal estimation S for any considered algorithm. The inner product used 
in Remark for definition of A is the standard Euclidean inner product not depending on the weight 
matrix M used in the algorithms, since this norm || • || corresponds to the problem (12) with equal 
weights qi. We will call the algorithms with the adjustment A adjusted algorithms. Eor example, the 
result of the k-th iteration of the Cadzow iterations can be expressed as Sfc = Then 

the result of the k-th iteration of the adjusted Cadzow iterative method is = A{fi>k). 


5 Numerical comparison 

Let us carry out numerical experiments for analysis of the performance of the considered methods. 
Comparison of the methods was performed on several examples, with a sine-wave signal and an 
exponentially-modulated sine-wave signal. Since the obtained comparison results are very similar, 
only the results for a sine-wave signal are presented. 

Suppose that the signal S = (si,..., sm) of length = 40 and rank r = 2 has the form: 

2’7r A* 

Sfc = 5sin—, k = l,...,N, (24) 

D 

and the series X = S -|- N is observed, where N is Gaussian white noise with mean equal to 0 and 
variance equal to 1. Accuracy of a signal estimate S is measured as the root mean-square error (RMSE) 
using 1000 simulations. Comparison is performed on the same simulated samples. It was checked that 
the stated comparison results are signihcant at the 5% level of significance. 
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Convergence rate and accuracy. We start with the investigation of the Cadzow-C method and the 
Cadzow(a) methods for several values of a, since they have not internal iterations and therefore their 
computational costs can be compared by the number of external iterations. These methods use an 
oblique SVD; the Cadzow(l) method is the conventional Cadzow method. Figure]^ shows the rate of 
convergence for a = 0.1 and a = 1 and for two different window lengths L. The RMSE values are 
depicted versus the number of performed iterations. 



Iteration 


Figure 2: The RMSE of the signal estimate depending on the number of iterations, <7 = 1. 


One can see that a method with a smaller limit error is the one with a slower convergence rate. 
Eor parameters involved to the simulations, the Cadzow(O.l) method with the window length L = 8 
has the smallest limit error. At the same time, these values of parameters correspond to both the 
slowest convergence and the most uniform weights. 

Note that the limit errors do not differ strongly, they change from 0.31 (a = 0.1, L = 8) in the 
best case to 0.35 (a = 1, L = 20) in the worst case. However, the error equal to 0.35 is achieved at 
the hrst iteration in the worst case, while it takes 4-5 iterations to achieve the error 0.35 in the best 
case. 


Accuracy vs number of iterations for Cadzow(a). The same signal (24) was taken to investigate 
how the RMSE and the convergence rate depend on a. for the Cadzow(a) algorithms with L = 20. 
The following STOPl criterion was taken; 


lir-bYfc)-r-bYfc+i) 


N 


< 10 “ 


Eigure shows that smaller a leads to more accurate estimates of the signal, but increases their 
computational costs. This general rule sometimes does not work for very small values of the parameter 
a, see Eigure where the noise standard deviation was increased from 1 to 3. One can see that for 
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Figure 3: The RMSE and the average number of iterations depending on a (log-scale), L = 20, <7 = 1. 


a smaller than 0.1 the dependence of the estimation errors on a changes. It seems that the threshold 
a, which corresponds to the change of the accuracy behaviour, depends on the 1-iteration separability 
of the signal from noise. Indeed, as we can expect, for small a the separability quality is poor (see an 
example in Appendix [A]) . 



LU 

cn 

cc 


Figure 4: The RMSE and the average number of iterations depending on a (log-scale), L = 20, <7 = 3. 
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Comparison by accuracy at the first iteration and in the limit. Let us now involve the Extended 
and Weighted Cadzow iterations and examine the spreading of the estimation errors along the series. 
The maximal number of iterations equal to 100 is taken for the stop criterion STOPl (this choice 
yields the error close to the limiting value); the stop criterion STOP2 for inner iterations is as follows: 
-— * • < 10“^. The initial left and right extended values 1^l-i and Ml-i in the Extended Cadzow 

iterations are obtained using the vector SSA-forecasting method m Section 2.3.1]. 

Figuresandshow the dependence of the RMSE on numbers of the series points. Figureshows 
the errors at the first iteration, Figure shows the errors at the 100-th iteration. It is clearly seen 
that the Extended Cadzow method is the most precise in both cases. The Cadzow(l) and Cadzow-C 
methods are the best at the first iteration among the set of methods without inner iterations. The 
best method in the limit (after 100-th iteration, the errors do not change significantly further) is the 
Cadzow(O.l) method; this is not surprising according to Figure]^ 



Figure 5: The RMSE of signal estimates at each series point; iteration 1; L = 20, a = 1. 


Errors for signal and original series approximations. Since we used the least-squares method for 
estimation of the signal S, consider Table lb which shows the RMSE for S as an estimate of S (i.e., the 
signal estimation errors) and the RMSE for § as an estimate of the original series X (i.e., the series 
approximation errors). Here k is the number of iterations, L = 20. Tableconhrms the conclusions 
about comparison of the methods by accuracy of signal estimation. Also it is seen that the quality of 
original series approximation does not always correspond with the quality of signal estimation. For 
example, overfitting is clearly present for the Cadzow(O.l) iterations at the first iteration. However, 
the methods are ordered identically by errors of series approximation and signal estimation in the 
limit. This means that minimization of the error of approximation likely yields minimization of the 
error of signal reconstruction. The same ordering of the errors is very important for practice, since 
for real-life data we can choose a better method and its parameters by smaller approximation errors. 
Certainly, a proper rank should be set before the comparison of the methods. 

The same simulations were performed with the adjusted algorithms (see Remark]^. One can see 
in Table that the accuracy is almost the same. By its dehnition, the adjustment always improves the 
approximation of the original series; however, the influence on the accuracy of signal approximation 
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Figure 6: The RMSE of signal estimate at each series point; iteration 100; L = 20, cr = 1. 


Table 1: Comparison of methods by the RMSE, L = 20, <7 = 1, for the signal (24). 


Method 

E>, k = 1 

X,k = l 

S, A: = 100 

X,k = 100 

Cadzow, a = 1 

0.3758 

0.9195 

0.3782 

0.9664 

Cadzow, a = 0.1 

0.4329 

0.7040 

0.3311 

0.9506 

Cadzow C 

0.3655 

0.8925 

0.3559 

0.9583 

Weighted Cadzow 

0.3644 

0.8891 

0.3455 

0.9549 

Extended Cadzow 

0.3361 

0.9030 

0.3189 

0.9471 


is ambiguous (the adjustment improves the accuracy at the 100-th iteration; results are various at the 
first iteration). 


Thus, the numerical examples mostly confirm the statements itemized in Section 4.5 


6 Real-life example 

Let us consider the series ‘Fortified wine’ (fortified wine sales, Australia, monthly, from January 1980 
till December 1993) [2]. This series has the following structure: a signal consisting of an exponential 
trend and a seasonality of a complex form and of noise. We compare the Cadzow(a) algorithms for 
different a and demonstrate that a smaller a provides a smaller approximation error. 

In Sectionj^we considered a simple example with a signal consisting of one sine wave. However, the 
real-life time series has much more complex form. To confirm the approach that we can minimize the 
approximation errors to diminish the signal estimation error, let us construct a model of the ‘Fortified 
wine’ series and use this model for simulation to check the approach. 

The series ‘Fortified wine’ has been analyzed in several papers (see e.g. [10] for a bit longer 
time series). A typical analysis of the time series by Basic SSA m Chapter 1] with window length 
L = 84 shows that the leading 11 eigentriples correspond to the signal. The ESPRIT method [I71II3] 
applied to the found signal subspace provides estimates of exponential bases pm for the trend and 
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Table 2: Comparison of adjusted methods by the RMSE, L = 20, cr = 1, for the signal (24). 


Method 

S, /c = 1 

X, /t = 1 

S, fc = 100 

X, A; = 100 

Cadzow, a = 1 

0.3714 

0.9175 

0.3667 

0.9622 

Cadzow, a = 0.1 

0.4385 

0.7023 

0.3276 

0.9493 

Cadzow C 

0.3626 

0.8909 

0.3478 

0.9555 

Weighted Cadzow 

0.3640 

0.8883 

0.3380 

0.9523 

Extended Cadzow 

0.3370 

0.9030 

0.3184 

0.9469 


Table 3: Comparison of the errors of signal estimation and series approximation using the Cadzow 
methods, for the ‘Fortified wine’ series and the model realizations. 


Method: 

S 

X 

X* 


Cadzow, a = 1 

127.71 

263.20 

283.58 


Cadzow, a = 0.8 

127.18 

262.98 

283.25 


Cadzow, a = 0.6 

126.42 

262.63 

282.72 


Cadzow, a = 0.4 

125.39 

262.06 

281.77 


Cadzow, a = 0.2 

124.10 

260.94 

279.55 


Cadzow, a = 0.1 

125.09 

260.52 

276.70 


Cadzow, a = 0.05 

129.44 

261.47 

274.00 



for modulations of seasonal components in the series components, where the A:-th term in the mth 
component is given in the form CmPm CmPm sin(27ra;m/c + (/>m), A: = 1,..., iV. The ESPRIT method 
also estimates the frequencies LOm] however, for seasonal components the possible frequencies are known 
and therefore we changed the frequency estimates to nearest values in the form j/12. The coefficients 
Cm before the found series components and the phases 4>m of seasonal components were estimated by 
the least-squares method. Noise is taken multiplicative, that is, its variance increases proportionally 
to the trend. Thus, the model of the signal S = (si,..., sat), = 168, is estimated as 

Sk = 3997.74(0.9967)^+ 

27r /u 

1174.75 (0.9942)^ sin(— - 2.249)+ 

27r /b 

425.75 (1.0001)'= sin(— + 2.333)+ 

Q/TT 

211.55 (1.004)'= sin(-+ 1.677)+ 

6 

27r k 

169.33 (1.0007)'= sin(— + 1.533)+ 

2'7r 

361.07 (0.9884)'= sin(-2.901). 

3 

The model of the whole series X = (xi,..., xn) is Xi = Si + 353.17 (0.9967)'=ej, where Ei, i = 1,..., N, 
is Gaussian white noise with mean equal to 0 and variance equal to 1. We set L = 84 and apply 
the Cadzow(Q:) algorithm with a = 1,0.8,0.6,0.4,0.2,0.1,0.05. The following STOPl criterion is 

taken in Algorithm i — — ^ < 10 The algorithm was applied to 1000 independent 
realizations of the model and also to the original ‘Fortified wine’ series. Table contains the RMSE 
of model signal estimation (the column S), the RMSE of model series approximation (the column X) 
and the approximation accuracy for ‘Fortified wines’ series (the column X*). 

Table 1^ shows that for a G [0.2,1] a smaller approximation error yields a smaller reconstruction 
error. However, for the smaller values a the tendency is broken. Probably, small values of a do not 


19 





























provide a sufficient separability from noise to converge toward the global minimum. 

Figure depicts the approximation of the original ‘Fortified wine’ series X* obtained by the 
Cadzow(0.2) algorithm. The dotted line corresponds to the original series, while the solid line shows 
the finite-rank estimate of the signal of rank 11. One can expect that the Cadzow(0.2) algorithm 
provides one of the most accurate finite-rank estimates of the signal. 



Month 

Figure 7: ‘Fortified wine’ series: application of the Cadzow(0.2) algorithm. 


7 Conclusion 

Several known and new iterative algorithms for approximation of a noisy signal by a finite-rank series 
were considered in the present paper. The approximation was performed by a least-squares method 
and its result was considered as an estimate of the signal. 

We used equivalent statements of the problems for weighted matrix approximation and weighted 
time-series approximation, where equal weights in the least-squares matrix problem correspond to 
unequal weights in the least-squares time series problem, and vice verse. 

A wide range of the iterative algorithms was reviewed with the aim to obtain equal weights in the 
least-squares method applied to time series. Equal weights were formally achieved in the algorithms 
using inner iterations, which converge to a local minimum only and also make the algorithms very 
time-consuming. It appears that the use of methods without inner iterations (Cadzow-type methods) 
leads to approximately equal weights only. 

Convergence of outer iterations by subsequences was proved for the reviewed algorithms. 

Comparison of the accuracy and convergence rate was performed by simulation on the example of a 
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noisy sine-wave signal. The simulation results confirmed the theoretical results. It appears that time- 
series weights, which are closer to equal, provides in the limit more time-consuming and simultaneously 
more accurate methods. Also, the simulations confirm that the convergence rate is in accordance with 
the separability rate. Therefore, for the Cadzow-type methods, there is the correspondence between 
slow convergence, poor separability, inaccurate approximation at the first one iteration and high 
accuracy in the limit; and vice verse. In particular, for the Cadzow iterations, which produce Singular 
Spectrum Analysis for signal reconstruction at the first iteration, the window length equal to half 
of the series length gives poor accuracy in the limit and one of the best reconstructions at the first 
iteration. 


A Separability of sine-wave signal from constant residual for the 
Cadzow(ct) iterations 


Let us consider modifications of SSA, which are produced by the first iteration of the Cadzow(a) 


iterative algorithms described in Section 4.4.1, Recall that the Cadzow(l) iterative algorithm produces 


the conventional Basic SSA method dills], while the first iteration of a general Cadzow(a) algorithm 
can be considered as a particular case of Oblique SSA [12] with the Euclidean inner product in the 
column space and a special inner product in the row space. 

Separability of signals from residuals in SSA is deeply investigated in dlls- Separability of a signal 
means the ability of the method to extract the signal. In fact, separability is related to the accuracy 
of signal estimation obtained at the first iteration of the considered iterative algorithms. Notions of 
exact, approximate and asymptotic (as the series length tends to infinity) separability together with 
examples of the asymptotic separability rates are introduced in m and can be generalized for the 
oblique case. Following by m, we will measure the separability by means of the cosines between L- 
and AT-lagged vectors of the signal and the residual. 

Let C G symmetric positive semidefinite matrix, Xi and X 2 be two different time series 

of length N, X^, be their trajectory matrices. Define the so-called correlation coefficient between 
the i-th and j-th columns as: 

. (v.'.N) 

iixi|iiix|ir 

where is the z-th column of the matrix X^, k = 1,2, (•,•) is the Euclidean inner product 
the Euclidean norm. Define the correlation coefficient between the i-th and j-th rows as: 


(25) 


IS 


Pi,j 


||AM||c||A2d| 


(26) 


where is the i-th row of matrix X^, k = 1,2, and (-jOc is the oblique inner product in 
generated by a matrix C as follows: {X, T)c = XCY'^ (here X and Y are row vectors), || • ||c is the 
norm with respect to this inner product. We say that the series Xi and X 2 are weakly e-separable if 


p = max I max I p^. 

\l<i,j<K ’■ 


max I p^ ,• 


< e. 


(27) 


We are interested in the order of e as A —>■ 00 for different matrices C, where the series k = 1,2, 
consist of the first N terms of infinite series X“. 

Here we apply the theory to an example with a sine-wave signal and a constant residual. By 
analogy with SSA, we can expect that the asymptotic separability rate will be the same if the residual 
is Gaussian white noise. Thus, let X“ = (cos(27ra;fc), fc = 1,2,...) and = (c, c,...). Consider 
A —>■ 00 and L{N), K{N) —>■ 00 such that A = L -|- A — 1. When C is the identity matrix, the answer 
is known: e has order l/min(L, A), i.e. the rate of separability has order 1/A for L proportional to 
A. This result can be found in m Section 6.1]. 
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Let us consider the separability rate for the Cadzow(a) iterations introduced in Section 4.4.1 


Remark 6. In what follows we will use the following denotation: a function f G 0{g{n)) as n —)■ oo 
if there exist k > 0 and no > 0 such that for any n > no the inequality |/(n)| < k\g{n)\ holds; a 
function f G Il{g{n)) as n ^ oo if there exist k > 0 and no > 0 such that for any n > uq the inequality 
|/(n)| > k\g{n)\ holds. 

Proposition 8. Let = (cos(27ra;/c), fe = 1,2,...), where 0 < ta < 0.5, be a sine wave, = 
(c, c,...) be a constant series, L{N), K{N) —>■ oo, where N = L + K — 1, h = h^ = [iV/Lj. Let also 
0 < a = a{N) < 1, and C = C(a) be defined in (20), i.e. C is a diagonal matrix with diagonal 
elements: 


Ck = 


1, if k = jL + 1 for some j = 0,... ,h — 1, 
a, otherwise, 


Then 


1. p given by (27) has the following order: p = O ^max ^^, where 


Cl,k = Cl;n)^k{n) = X] cos(27ra;(j + A: - 1)), 

Cfc = l 


and 


Dl,k = L)Lf^N),K{N) = X] cos^(27ra;(j + k- 1)). 

l<k<K: 

Cfc = l 

2. If h^ is bounded by a constant, then p = O (max (^, ^)) ■ 

3. If there exists small 6, 0 < 6 < 1/2, such that 2L{N)u G R \ (Ufcezl^ - 6,k + 5]) for every N, 
where Z is the set of integers, then p = O ^max (i_Q)jv/L+aio )) • 

Proof. 1. To prove the theorem, we should evaluate the order of the expressions: 


Y.k=^ ^ cos{2TTijjk) 


Pli = 


^(Eg-^cos^ 


E^I Cfc cos(27rw(j + A: - 1)) 


Pid = 


\j (Ef=i Cfc) (Ef=i Ck cos2(27ra;(j + k- 1))) 


(28) 


(29) 


The following trigonometric equalities hold: 


cos{ak + b) = csc(a/2) sin(an/2) cos 


k=l 


an + a + 2b 


(30) 


yy cos^{ak + b) = ^(2n + csc(a) sin(2an + a + 2b) — 
k=l 

— csc(a) sin(a + 26)), (31) 


for any real a, 6 and positive integer n. Therefore, since the series Xi is not constant, the numerator 
in (28) has order 0(1), while the denominator has order Ll{L). Thus, we obtain the order 1/L. 
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To evaluate the order of (29), consider the sum over k such that Cfc = 1 separately: 


K 

Cfc cos(27ra;(j + /c — 1)) 

k=l 


(1 


and 


a) cos(27ra;(j + k — 1))+ 

l<k<K: 

Ck=^ 

+ acos{2TTU){jk — 1)) 

l<k<K 

K 

yy Cfc = (1 — a)h + aK, 

k=l 


(1 - a)0{CL,K) + a 0(1)! 


K 

yy Cfc cos^(27rw(j + A; — 1)) = 

k=l 

(1-a) X; cos^(27ra;(j + k — 1)) + 

l<k<K: 

Ck = ^ 

+ acos^(27ra;(j + A: — 1)) = (1 — + a 

l<k<K 


2. is exactly the maximum of sums, each of h cosine values, therefore, the absolute value of 
Cl,k is not larger than h. Therefore, if h is bounded by a constant, then \Cl,k\ is bounded by the 
same constant, so, Cl,k = 0(1). 

3. The condition 2L{N)ui G R \ (Ufcezt^ “^ guarantees that | csc(7rL(A^)u;)| in (30) for 

Cl^k and | csc{2TrL{N)uj)\ in (31) for k are bounded by a constant; therefore, we obtain an upper 
bound for and a lower bound for Dl^k- Thus, Cl,k has order 0(1), while Dl^k has order 

n{N/L). ’ ’ ’ ’ □ 


Remark 7. Let us suppose that we have chosen L{N) such that p has order max (^i_a)N/L+aK j ■ 

Then the optimal ehoice for L is L k, +4A''(i cO _ Hence, the rate of separability 

has the same order 0(1/N) for a{N) —)• c, where 0 < c < 1 is some constant (however, a smaller c 
eorresponds to a smaller multiplier before 1/iV/ In the case of eonverging to zero a{N) = 0{N~l^), 
the rate of separability becomes equal to 0{N^~^) for 0 < /3 < 0.5 and to 0{1/VN) for jd > 0.5. 
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